Abstract

The Wumpus Advisor program offers advice to a player involved in choosing the best
move in a game for which competence in dealing with incomplete and uncertain knowledge
is required. The design and implementation of the advisor explores a new paradigm in
Computer Assisted Instruction, in which the performance of computer-based tutors is
greatly improved through the application of Artificial Intelligence techniques. This report
describes the design of the Advisor and outlines directions for further work. Our
experience with the tutor is informal and psychological experimentation remains to be done.
1. Introduction


The Wumpus Advisor grew out of a course we gave in Educational Technology to a
small group of graduate and undergraduate students at MIT. Our goal was to explore a
new paradigm in Computer Aided Instruction, in which the competence of computer-based
tutors is greatly improved by applying Artificial Intelligence techniques to their design. We
particularly wished to study the structure of Intelligent Computer Aided Instruction (ICAI)
programs that incorporate an Expert module which allows the tutor to compare the
student's response to those generated by the expert. In using the term ICAI and exploring
the consequences for a tutorial program of the availability of an expert module, we follow
the lead of John Brown (Brown and Burton 1975), who has shown in his design of
sophisticated instructional environments for electronics the promise of this approach.

In order to experiment with this paradigm, an ICAI program for a simple game was
implemented as a course project. The program serves as an Advisor to a player, offering
advice and analysis at appropriate times. We chose Wumpus, a maze exploration game,
because it represented the next step in complexity beyond the tutor designed by Burton and
Brown for West, a simple game on the Plato system for exercising arithmetic skills (Burton
1976). Wumpus is motivating and requires a variety of skills covering planning, plausible
reasoning, decision theory and incomplete and uncertain knowledge.

The Wumpus Advisor was successfully implemented by the students in the course
under Stansfield's supervision. The program was later improved and extended by Carr,
who is continuing to work on the project. This paper describes the current state of the
program which gives appropriate advice in English about the logic involved in choosing a
best move. Four different levels of student are catered for but other than this broad
distinction there is little student modelling. This aspect of the research is currently being
developed.

By studying simple teaching situations and modelling them with programs that teach
we gain insight into the processes underlying learning and teaching. The rich metaphors
of computer programming help us to describe teaching and learning precisely and in detail
while the discipline imposed by requiring a working program weeds out impractical ideas
and points the way to better ones.

CAI programs need models of situations and students if they are to understand what
is going on and act appropriately. We must provide them with practical procedures for
making decisions about teaching and give them a precisely formulated knowledge of their
subject matter so that they can interpret, model and act in a variety of teaching situations.
They also need an expressive means of communication such as natural language, display
screens and tablets for both interpreting the student's behaviour and making effective
responses.

Many early teaching programs and some current ones were "fact dispensing"
machines. They used the "empty bucket" theory of learning, a trivial one in which the
learner is simply a receptacle to be filled with facts. Although this theory may be decorated
with extra rules to present facts in special orders or in clusters, it is very naive and hardly
says anything at all about real learning. The key computing concept which it excludes is
that of a process. The student should above all else be learning how to do something and
should be participating in various activities toward that end. He is programming himself
with the teacher's assistance. By changing the paradigm from facts to procedures the whole
enterprise is greatly enriched.

From this viewpoint we are forced to analyse the student's learning task and compare
this with his behaviour. It becomes important to notice and correct the things he does
wrong, forgets to do, does unnecessarily or does in the wrong order. Many ideas from
Computer Science are of great significance to this. The student's task can be modularly
decomposed into subtasks with individual goals. These sub-tasks can be organized as
processes, coroutines or steps in a procedure. The vocabulary of Computer Science is rich
in precise concepts for describing this. Similarly, his organization of information and
methods must be examined and debugged. There are sufficient partially-formulated
concepts in AI that deal with perception, natural reasoning, organising knowledge, planning
and so on, for new descriptions to be made of the learning and teaching process.

The Wumpus Advisor develops the application of computers in education. It is the
first version of a program which helps a student to learn a simple game called Wumpus
(Yob 1975). Acting as an interface between the student and the game, it intervenes
wherever the student's moves show that he needs advice. Advice is given as English
discourse explaining in full the merits and faults of particular moves. Wumpus is played
in a network of tunnels whose connections are initially unknown to the player. He must
search this network avoiding dangers and trying to find and kill the dangerous and deadly
Wumpus. Throughout play the advisor gives the student information about his immediate
locality and evidence about nearby dangers. From this information it is possible to make
plausible inferences and judgements which aid in avoiding dangers. The game is highly
motivating to children and exercises several types of reasoning skill.

The game paradigm for advisors has also been researched by Burton using the game
West (Burton and Brown 1975). Wumpus is a more complex game and is a natural next
step. In general, games form excellent subject matter for advice giving. They are varied,
provide motivation, and exist at many degrees of difficulty. Some, such as chess, have
large bodies of advice associated with them in the literature. Games are often models of
real-world situations and develop abilities that are useful in everyday life. Many of the
strategies involved in the game of Go are of this nature.

There are five good reasons for using a simple game as the domain of an advice-giving
program.

1. Closure 

The rules are clearly defined. Since it is easy to describe what constitutes a legal move the
student can always be expected to play within the rules even if he plays badly. This means
that the advisor will be able to make sense of his inputs. With a less bounded domain it is
easy for breaks in communication to occur because the program cannot understand the
student.

2. Expertise 

We can easily design an expert player for many simple but interesting games. An expert
gives a precise procedural theory of the domain which we aim to teach.




3. Homogeneity 

For simple games the same theory of good play applies at each move. The rules that the
expert uses are good at all stages of the game. This gives generality to the teaching
situation. A skill is being taught which is exemplified in different ways throughout the
game.

4. Simplicity

It is easy to find simple examples of games well within programming capability.

5. Motivation

The student is motivated by a game which he may not be by traditional curricular domains.


These properties make it easy to sustain an interaction between the student and the
teacher. Even with no advice-giving at all, the game scenario provides a continuing
exchange. In a sense this is cheating for it makes it easy to write a "toy" program but the
important point is that we can start from such a position and enhance the advice giving
step by step. This is the way people learn games in any case, beginning with the rules and
accumulating strategies which cover progressively more situations.

Our general methodology was to find a domain which the computer can deal with
easily, which requires only simple inputs but which has a large set of states. Games fit this
well. Electronics does too as Sophie, the electronics advising program (Brown and Burton
1975), shows. Sophie helps a student learn how to repair a faulty electronic circuit. A faulty
circuit can be simulated. Moves correspond to measurements or alterations, and, though
there are only a few move types, the possible hypotheses that can be made about a faulty
circuit are numerous and varied. Domains like geography or history are hard to use in a
CAI program. They are very knowledge-oriented and tend not to be closed. Limited and
well structured aspects of them must be used if the domain is not to expand continually or
the student is not to overreach the program's knowledge (see Collins 19[illegible] for promising
work in this direction).

A simple game like Wumpus makes the task of writing an advisor manageable but
does not exclude important features of the teaching process. Models of the student, ways of
using them to provide relevant advice, questions of motivation and of not overadvising,
can all be studied even for a simple game. We have not programmed any student
modelling facility yet in our advisor though the work we have completed is a preparatory
step.

The student is doing several things when he plays Wumpus with the advisor. First,
he is learning how to play Wumpus. An adaptation of the program could also teach him
variations and perhaps entirely different types of game. By learning Wumpus he learns
certain reasoning and planning methods. These are of various types which we summarize
shortly. At a more general level, the student is learning how to approach new games and
what methods are appropriate for unravelling the consequences of a given set of rules.
This is not restricted to games. There are more general situations with logical properties
and rules and he might be developing a skill in producing effective procedures for acting
in these situations. When first in a new situation one must direct the most resources
towards an understanding of the situation. As skill accumulates, fewer resources are needed
and eventually tuning up and debugging is only done rarely. This is a general property of
skill acquisition. (See Sussman for a computer model of this kind of learning.)

The corresponding aim of an advisor is to help the student learn how to do all this.
Our current Wumpus Advisor only advises on particular points of play so the student will
only build up general skills indirectly. Later, we describe an approach that can be taken to
improve the Wumpus advisor and consider decision making skills in more general terms
showing how the Advisor might teach these.

There appear to be several different styles of playing and thinking about Wumpus.
People bring a variety of attitudes to the game. Some play very safely while others play
with abandon for the fun of taking risks. Those who approach the game from the point
of view of its logical structure are more likely to learn efficient play in a shorter time than
those who neglect this structure. On the basis of informal observations, they appear to
quickly absorb and benefit from the current program's style of advice. Players who see the
game from other viewpoints might also benefit from our advisor's analytic approach which
can be generalized widely to other domains. However, the current advisor does not give the
gradual and sensitive advice about logical rules which must be provided for a student
whose manner of play is different from its own. Again, on the basis of informal
observations, we find that such subjects ignore long technical advice because it spoils the
fun of the game. A more appropriate advisor would understand their motivations and
treat the logical aspect as only one of several. This is an area which deserves considerable
research.


1.2 Analytical and Synthetic approaches to learning games


When a student is given the rules of Wumpus he must first analyse them to
determine their implications. There are several ways he can do this. Firstly he can
experiment, playing a variety of possibly risky moves until he empirically determines the
regularities. In complex situations experimentation is combined with induction to generate
and test hypotheses. A more direct method of analysis uses logic to infer properties of the
game so that strategies can be developed to take advantage of these properties. This is
very clearly illustrated in Wumpus. The player knows some but not all of the state of the
board at any time. He can analyse the laws of the game and can develop about one dozen
precise rules of inference that he can use to help locate the Wumpus and avoid dangers.
He must embody these rules in a procedure for analysing a board situation and must use
synthetic principles to do this. The Advisor contains an expert Wumpus player which has
all of these rules already available to it. When relevant, it points out examples of the rules
to help the player make his move. The player is made to consider the corresponding rule
and incorporate it into his play.

Techniques of synthesis are used to construct programs and plans. Goldstein
(Goldstein and Miller, 1976) describes a classification scheme for plans in the context of
Logo program writing. Typical examples are linear plan, recursive plan and parallel plan.
Acquiring skill at Wumpus can be seen as synthesizing a set of programs, so different
synthesis techniques lead to different Wumpus playing strategies. Many problems are
encountered when assembling separate pieces of advice into a coherent strategy. Some rules
have preconditions and may only be invoked in certain situations. A strategy which only
applies in certain circumstances will otherwise give rise to bad play. It is useful to explain
errors in the student's model of play in terms of debugging and recognisable bug types.
The student may then learn to recognise bug types himself and gradually build up a
repertoire of repair techniques.

1.3 Methods appropriate to Wumpus

Besides general techniques of synthesis and analysis there are those which are
associated with particular domains. Wumpus includes two types of knowledge omitted from
previous teaching programs. These are incomplete and uncertain knowledge. A Wumpus
player usually knows only a portion of the board and must develop procedures which can
act effectively under these conditions. Three general methods, decision theory, probability
theory, and planning, are useful techniques for this type of situation.

1. Planning.

To play a game well one has to plan and should learn to avoid certain planning bugs such
as planning too far ahead or too unevenly. There are often good reasons for choosing a
few candidate moves and restricting lookahead only to these. AI has a considerable body
of knowledge about planning in various domains and these principles should be taught by
a good advisor.

2. Decision Theory.

Because Wumpus involves uncertainty and most moves have a combination of valuable
and dangerous outcomes we can well apply the decision theory paradigm which is useful in
many more general situations. This theory shows how to assign values and costs to
properties of outcomes and gives a way of comparing these utilities when the outcomes
occur with calculable probabilities. It incorporates a back-up algorithm that combines
planning with evaluating particular states.

3. Probability.

In any uncertain situation probabilistic heuristics may be used to advantage. Estimating
the probabilities of death at each move is crucial to good Wumpus play and our program
uses qualitative probabilistic reasoning in its expert player and for giving advice.
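
The decision theory idea in item 2 can be illustrated with a minimal sketch of an
expected-utility comparison between two moves. The utility numbers, the cave names and the
function names below are hypothetical choices made for illustration; the report does not
give the values or procedures the advisor uses.

```python
def expected_utility(outcomes):
    """Expected utility of one move.

    outcomes is a list of (probability, utility) pairs that together
    cover every way the move can turn out.
    """
    total_p = sum(p for p, _ in outcomes)
    if abs(total_p - 1.0) > 1e-9:
        raise ValueError("outcome probabilities must sum to 1")
    return sum(p * u for p, u in outcomes)


# Hypothetical utilities (assumptions, not taken from the report).
DEATH = -100.0      # cost of dying in the cave
NEW_CAVE = 5.0      # value of safely exploring a new cave


def move_utility(p_death):
    """Utility of entering a cave whose estimated chance of death is p_death."""
    return expected_utility([(p_death, DEATH), (1.0 - p_death, NEW_CAVE)])


# Two candidate moves with estimated probabilities of death.
moves = {"cave_a": 0.0, "cave_b": 0.3}
best = max(moves, key=lambda name: move_utility(moves[name]))
```

With these assumed numbers the safe cave is preferred; a different choice of utilities
could make a risky but informative cave the better move, which is the kind of trade-off
the paradigm is meant to expose.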

1.4 The rules of Wumpus

Wumpus is played by one player, a Wumpus hunter, in a world consisting of a
number of caves connected by tunnels. The player moves around this warren trying to
avoid dangers and with the goal of finding and shooting the Wumpus. Initially the hunter
only knows the structure of the warren immediately around him. He knows the number of
the cave he is in and of all caves directly connected to him by tunnels. Every time he
makes a move, which must be into a neighboring cave, he is told the cave-numbers
neighboring his new cave. The dangers of the warren are pits, bats and the Wumpus
which, like the player, are initially located at random in the warren. Any move into a cave
containing a pit or the Wumpus results in instant death. If the player moves into a bat
cave he is carried away by the bats and dropped into a random cave which may of course
contain danger. Bats are not fast enough to save the player from pits or the Wumpus if he
inadvertently wanders into a cave containing both bats and one of these hazards. They do
carry the player away before he gets a chance to see what the neighbors of the bat cave are,
though. There are clues which help in avoiding the hazards. The player hears squeaking
if he is one cave away from a bat and he can feel a breeze if he is one away from a pit.
He can also smell the stench of the Wumpus from up to two caves away but cannot tell the
distance directly. None of this evidence tells the player the direction of a hazard. The
hunter has a bow and five arrows which he can fire at any time into a neighboring cave.
The arrow will ricochet at random through the warren for up to a distance of five caves
and will kill the Wumpus if he is hit. It is possible that the arrow will by chance find its
way back and kill the hunter. A typical warren will contain 20 caves, 3 bats, 3 pits, the
player and the Wumpus.
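
The clue rules above (squeak one cave from a bat, breeze one cave from a pit, stench up to
two caves from the Wumpus) can be sketched as follows. This is only an illustration: the
six-cave ring used as a warren, and the names caves_within and clues, are assumptions made
here and are not taken from the program.

```python
from collections import deque


def caves_within(warren, start, max_dist):
    """Return {cave: distance} for every cave within max_dist tunnels of start."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        cave = queue.popleft()
        if dist[cave] == max_dist:
            continue  # do not expand beyond the distance limit
        for nxt in warren[cave]:
            if nxt not in dist:
                dist[nxt] = dist[cave] + 1
                queue.append(nxt)
    return dist


def clues(warren, cave, bats, pits, wumpus):
    """Evidence the player receives on entering cave (no direction is given)."""
    one_away = set(warren[cave])
    two_away = set(caves_within(warren, cave, 2)) - {cave}
    return {
        "squeak": bool(one_away & bats),   # a bat is one cave away
        "breeze": bool(one_away & pits),   # a pit is one cave away
        "stench": wumpus in two_away,      # the Wumpus is one or two caves away
    }


# Hypothetical warren: six caves joined in a ring (not the report's 20-cave warren).
warren = {0: [1, 5], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4, 0]}
evidence = clues(warren, 1, bats={2}, pits={4}, wumpus=3)
```

Note that the stench clue does not distinguish one cave away from two, which is why the
player has to combine evidence from several caves to locate the Wumpus.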

1.5 A Wumpus Scenario

The student's input is in bold type; the computer's output is in italics. For the sake of
readability we have taken the liberty of editing the advisor's output by, for example, using
pronouns to refer to caves that have already been mentioned. We could easily update the
program to effect most of these changes itself. In the case of pronoun reference,
Winograd's program (Winograd 1973) provides methods that are sufficient for the limited
examples in this scenario.

Hello Ira. I see you are a beginning player. You have had 3 wins and 3 losses. Let's begin.

You are at cave 15 with neighbors 4, 14, 0, and [illegible]. Brrrr! I feel a draft. You are
near a pit. What a stench. I smell the Wumpus. What now?

4

I wish you luck. You are at cave 4 with neighbors 15, 14, 8, 2, and [illegible]. Brrrr!
Squeak! A bat is near. What now?

14

This is risky. You have been at two neighbors of 14 and you felt a draft at both.
This is double evidence so it is probable that a pit is there. What now?


Advice

Cave 8 is a better cave to explore. What now?

Why

There is no danger from the Wumpus at 14 or 8. It is possible that either contains a pit
but it is more likely that there is one at 14. What now?

8

You are at cave 8 with neighbors 19, 1, 14, 4 and 9. What a stench! What now?

19

You are at cave 19 with neighbors 17, 8, and 6. Squeak! What now?

8

This is a good move, Ira, since it is on its way to the safe caves 1 and 9. You are at cave 8
with neighbors 19, 1, 14, 4 and 9. What a stench! What now?


1

You are at cave 1 with neighbors 12, 8, and 9. What a stench! What now?

12

That is a risky move. The Wumpus could be in 12 because you haven't found any caves
within two caves of 12 that are smell free. What now?

Advice

You should visit all the caves which you know are safe before you start taking risks. Cave 9 is
a good cave to explore. Like 12 it is free from bats and pits; however, the Wumpus cannot be
at cave 9. What now?

More

If the Wumpus were at cave 9, then cave 8 would be next to him and you would smell him at
cave 19. But you didn't smell him at cave 19, so 9 is safe. What now?

9

You are at cave 9 with neighbors 8, 7, 10, and 1. What a stench! What now?

Help

I can give you information about where you are, where you have been, and what you have
learned at each cave that you have visited. I can also give you advice. What now?




Advice

Cave 10 is a good cave to explore. It is possible that the Wumpus is in cave 10, but that is a
necessary risk. There are no safer caves. What now?

10

You are at cave 10 with neighbors 11, 9, 5, and 2. Brrrr! What a stench! What now?

11

It is possible that cave 11 contains a pit. Would you like to go ahead with that move?

No

What now?

7

You cannot go directly to cave 7 from cave 10. Would you like a route to cave 7?

Yes

Go to 9 and then to 7. Shall I make the [remainder illegible]

You are at cave 7 with neighbors [illegible], 0, 9, and [illegible]. What now?

9

You are at cave 9 with neighbors 8, 7, 10, and 1. What a stench! What now?

1

You are at cave 1 with neighbors 12, 8, and 9. What a stench! What now?

12

That is a fatal move. Would you like advice?








The Wumpus is there by the following argument:

Cave 8 must be exactly two away from the Wumpus since it smells and cave 19 doesn't. This
means that either cave 1, 14 or 9 must be next to him.

Cave 14 isn't one away since it is next to 4 which doesn't smell. Cave 9 isn't one away since you
visited cave 7 and it didn't smell. Therefore, cave 1 is one away. Hence,
one of its neighbors must be the Wumpus.

The neighbors of 1 are 8, 9 and 12. You have visited 8 and 9 so, by elimination, the Wumpus
is at 12.

Shoot

Which cave would you like to shoot into?

12

Congratulations, you have shot the Wumpus.




2. The structure of the advisor


2.1 Major capabilities

The Wumpus Advisor has several capabilities organised around an expert Wumpus
player that embodies a considerable amount of knowledge about the game. This expert can
evaluate the student's move, compare it against the best move and explain differences so
that the student will improve his game. Future versions will include a model of the student
as a perturbation of the expert. This will increase sensitivity to the particular problems
facing each student of the game. In this section we outline the structure of the expert, its
capabilities, its basic method of deduction and its advising and explaining strategies.
Section 3 covers the details of each of these topics and section 4 outlines an improved
approach developed by criticising our present effort.

Our expert Wumpus player has four major capabilities.

1. It deduces information about the state of the game from what it knows the player
knows.

2. It can evaluate any move that the player can make.

3. It classifies all moves according to a set of categories designed to capture the major
strategies of Wumpus playing.

4. Its evaluation of a move is modular.


At any time in a Wumpus game the player can see a small portion of the warren and
can remember areas he has visited or has seen from a visited cave. He has partial
knowledge of the warren from this information. He can use his memory of the location of
bats he has come across and all the evidence from smells, breezes and squeaks that he has
discovered in the course of the game. A good player should be able to deduce useful
information about the position of various hazards by combining this information and
using inference rules entailed by the rules of the game. The expert makes most of these
deductions, only using information the student knows or ought to have remembered. In
time, the advisor teaches the student to make all of these deductions himself in a reasonable
manner and to use the information discovered to make a best play. There are two broad
classes of information our expert can deduce. First, it can often determine exactly the
positions of a bat, pit or the Wumpus, or can tell that a cave is definitely free of such
hazards. This is clearly important to good play for hazards must be avoided and safe
caves are worth investigating. Second, and very important in uncertain and incomplete
situations where definite facts are unavailable, the expert can evaluate probabilities of
hazards for any particular cave. Various heuristics are used for this and they represent
qualitative knowledge about using evidence to make decisions.
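
One inference rule of the first kind follows directly from the stench rule: any cave known
to lie within two tunnels of a visited cave that did not smell cannot hold the Wumpus. The
sketch below is an assumed formulation, not the expert's actual code. It uses only tunnels
the player has been told about, and it assumes the Wumpus stays in one place. The names
known_edges and wumpus_free, and the small map, are illustrative.

```python
def known_edges(visited_neighbors):
    """Build a symmetric adjacency map from the neighbor lists of visited caves."""
    adj = {}
    for cave, neighbors in visited_neighbors.items():
        for n in neighbors:
            adj.setdefault(cave, set()).add(n)
            adj.setdefault(n, set()).add(cave)
    return adj


def wumpus_free(visited_neighbors, smell):
    """Caves that cannot contain the Wumpus, given what the player has seen.

    visited_neighbors maps each visited cave to the neighbors reported there.
    smell maps each visited cave to True if a stench was noticed there.
    """
    adj = known_edges(visited_neighbors)
    free = set(visited_neighbors)           # the player survived these caves
    for cave, smelled in smell.items():
        if smelled:
            continue
        for near in adj.get(cave, ()):      # one tunnel from a smell-free cave
            free.add(near)
            free.update(adj.get(near, ()))  # two tunnels, through known links
    return free


# Illustrative fragment: cave 19 had no stench, cave 8 did.
seen = {19: [17, 8, 6], 8: [19, 1, 9]}
safe_from_wumpus = wumpus_free(seen, {19: False, 8: True})
```

Here caves 1 and 9 come out as free of the Wumpus because each is next to cave 8, which is
next to the smell-free cave 19. The same pattern of argument appears in the advisor's
explanation of why cave 9 is safe in the scenario.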

Information gathered by these techniques is then used by the expert to evaluate each
possible move. All moves are treated independently. There is no need to plan ahead in
detail since a move can almost always be made at any time if at all. Only when a bat
transfers a player to a remote part of the warren do caves become inaccessible. Even in this
case the warren is so interconnected that it is unlikely to be much of a handicap. A move
evaluation consists of a probability assignment for each hazard type and a simple measure
of the information that would be gained by the move. So cave 3 may have a 0.3
probability of a pit, a certain bat and definitely no Wumpus. It may be near the Wumpus
and so be likely to give information about it.

The expert has an executive which classifies all possible moves according to a seven
point scale of goodness shown in figure 1 and discussed in detail in section 3.4. Each
category is a distinct type. Safe moves are preferred to unsafe ones and, given two moves
of roughly equal safety, the one which reveals most information about the warren and the
Wumpus is regarded as the best. All moves in the fringe area are considered. These are
caves which are accessible but have not yet been visited. It is a waste of time to visit a cave
that has already been visited unless it is on the way to another profitable cave in the
fringe. If the player does visit such a cave it is assumed he is going somewhere valuable
unless he wastes too much time by going in profitless circles.

The expert is composed of four main units, an executive and three specialists, one
each for bats, pits and the Wumpus. Naturally, from the symmetry of the game, the bats
and pits experts are very similar and use similar deduction rules. Each specialist deduces
what it can about its associated hazard and reports to the executive. Modularity allows for
a comprehensible expert which is a natural advantage for teaching purposes. The student's
play can be evaluated separately for each speciality and also on their integration. We
expect that this will make it easier to construct student models. It certainly allows the
current advisor to advise about one particular module at a time.


EXECUTIVE CLASSIFICATION

            IS THE CAVE SAFE?          DOES THE MOVE GIVE INFORMATION?

TYPE NO.    FROM BATS    FROM THE      ON THE       ON THE
            & PITS       WUMPUS        WARREN       WUMPUS

   1        YES          YES           YES          YES
   2        YES          YES           YES          NO
   3        YES          NO            YES          YES
   4        NO           YES           YES          YES
   5        NO           YES           YES          NO
   6        NO           NO            YES          YES
   7        DEATH        DEATH         NONE         NONE


TYPE NO.    WUMPUS VALUE    BATS & PITS VALUE

   1             1                 0
   2             2                 0
   3             3                 0
   4             1            0 < VAL < 1
   5             2            0 < VAL < 1
   6             3            0 < VAL < 1
   7             -                 1


Safety has been given precedence. The bats/pits value of a cave is
the probability of death by bats or pits in that cave. The Wumpus value is
1 if the cave is safe from the Wumpus but will give information about it, 2
if it is safe but will give no information, and 3 if it is unsafe.
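
As a rough illustration of how the seven move types relate to the two values described in
the caption above, the sketch below maps a move evaluation onto a type number. The
MoveEvaluation record, the certain_death flag and the classify function are assumptions
introduced here; the report does not show how the executive is coded, and the treatment of
type 7 in particular is a guess.

```python
from dataclasses import dataclass


@dataclass
class MoveEvaluation:
    p_bat_or_pit: float          # bats & pits value: probability of death
    wumpus_value: int            # 1 = safe and informative, 2 = safe, 3 = unsafe
    certain_death: bool = False  # assumed flag for a cave known to be fatal


def classify(ev):
    """Return the move type (1 is best, 7 is certain death)."""
    if ev.certain_death or ev.p_bat_or_pit >= 1:
        return 7
    if ev.wumpus_value not in (1, 2, 3):
        raise ValueError("wumpus_value must be 1, 2 or 3")
    if ev.p_bat_or_pit < 0:
        raise ValueError("p_bat_or_pit must be a probability")
    # Safety from bats and pits takes precedence: types 1-3 are safe
    # from them, types 4-6 carry some risk from them.
    base = 0 if ev.p_bat_or_pit == 0 else 3
    return base + ev.wumpus_value


# Choosing the best of several candidate moves by lowest type number.
candidates = {
    "cave_a": MoveEvaluation(0.0, 2),
    "cave_b": MoveEvaluation(0.3, 1),
}
best_move = min(candidates, key=lambda name: classify(candidates[name]))
```

In this sketch a cave that is safe from bats and pits always ranks above one that is not,
which matches the statement that safety has been given precedence.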
2.2 Extra facilities


Several extra facilities have been added to the basic expert outlined above. They can
be thought of as extra modules although they do not relate to the executive in the same
clear way as the three hazard modules. All three of the facilities we next describe could be
improved greatly and integrated into the advisor more cleanly.

We include a simple help specialist which will offer the student a good move when
he is in trouble and will also present an explanation of it if the student desires. It is almost
entirely a call to the expert for the current best move. We make no attempt to supply a
move which is tailored to the student's current difficulties. This enhancement will only be
reasonable when student modelling is implemented.

Since the player may not remember all of the warren he has come across so far, we
provide a route finder specialist. If he has any difficulty in reaching a goal suggested by
the move suggester the advisor will offer a route through known safe caves. This is
coupled with a help facility which gives the player information about any cave he has
visited on request.
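
A route through known safe caves can be found with an ordinary breadth-first search over
the part of the warren the player has seen. The sketch below is one plausible way to do
this and is not taken from the program; the function name find_route and the small map are
assumptions.

```python
from collections import deque


def find_route(known_map, safe, start, goal):
    """Shortest route from start to goal passing only through safe caves.

    known_map maps each visited cave to the neighbors reported there.
    The goal itself need not be known to be safe. Returns a list of
    caves beginning with start, or None if no such route is known.
    """
    if start == goal:
        return [start]
    parent = {start: None}
    queue = deque([start])
    while queue:
        cave = queue.popleft()
        for nxt in known_map.get(cave, ()):
            if nxt in parent:
                continue
            if nxt != goal and nxt not in safe:
                continue  # never route through a cave not known to be safe
            parent[nxt] = cave
            if nxt == goal:
                path = []
                while nxt is not None:
                    path.append(nxt)
                    nxt = parent[nxt]
                return path[::-1]
            queue.append(nxt)
    return None


# Illustrative map: the player is at cave 10 and wants to reach cave 7.
known_map = {10: [11, 9, 5, 2], 9: [8, 7, 10, 1]}
route = find_route(known_map, safe={8, 9, 10, 1}, start=10, goal=7)
```

For this map the search returns the route 10, 9, 7, the same form of answer as "Go to 9
and then to 7" in the scenario.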

More important and most in need of further development is the shooting specialist
whose job it is to prevent the player from wasting arrows and to advise him to shoot if he
should be able to deduce the exact location of the Wumpus. It will dissuade the player
from shooting if he has not located the Wumpus exactly or if he shoots into a cave that
could not be the Wumpus, especially if there are other worthwhile things to be done.
Future shooting specialists ought to weigh up the risks of shooting, the value of the arrow,
the possibility of hitting the Wumpus and the availability of good plays elsewhere. We
return to this when we consider a decision theory paradigm for future Wumpus advisors.

2.3 The advising paradigm

The advising paradigm for our current program is a simple one. This is because we
do not yet have a component which effectively makes models of the student. Our system
describes his immediate behaviour and not the reasoning that led him to this. As a
consequence, the advisor will advise when the student makes any non-optimal move and
will give him a description of his bad play which is usually too full. Nevertheless, there are
some subtleties involved even using our simple techniques.

While discussing the expert we noted that the executive classifies the student's move
according to a seven point set of categories (see figure 1). We associate a program called a
move-type-analyst with each type in this category set. The job of such an analyst is to
comment whenever the student makes a move of that particular type. Each analyst will
check to see if the student made a move that was significantly worse than the best possible
before it criticises him. The conditions for this vary according to the particular type and
this is one reason for having separate analysts. In general the best moves are the ones with
the lowest classification numbers and a drop of one makes a significant difference. This is
not always the case. For example move-classification 4 (unsafe because of bats or pits but
safe from the Wumpus while giving information about it) is not always significantly worse
than class 3 (safe from bats and pits but in danger from the Wumpus) even though in
general a drop of one class does make a significant difference.

The comments made to the student depend on move types as well as on the particular
board state. Firstly, the analyst comments on the move type itself with some statement such
as "that is a risky move". Of course if there is no safe move it will say "good luck" and
leave the player to his fate but often more specific comment is needed. There are two types
of bad feature a move may have, those that are avoidable and those that are not. The
analyst only comments on the avoidable ones, a property which depends on the better moves
available at the time. If the avoidable danger was a bat hazard the bats-expert would be
called in to give an explanation of the hazard. The implicit assumption is that the student
did not see it. With a good student model we could distinguish between this and the case
when the player noticed the hazard but failed to see any better move. The advisor focuses
the player's attention and stimulates him into finding a better move by referring to the
hazard as a reason for not making the move he tried. It is also possible that the student
found other moves which were free from the criticism but noticed faults in these that he
was mistaken about or that he gave too much weight to. A good modeller should allow us
to adapt advice giving to cases like this.

Having criticised the player's move the analyst allows him to think for a while by
asking him if he wishes to go ahead. The player can change his move and will then be
offered a better one. On request from the player the analyst will compare its suggestion
with the player's move. The explanation is comparative so no common features of the two
moves need mentioning.

We have summarised that part of the advisor that currently fits nicely into a
framework. Throughout the program are numerous patches that improve advice giving in
ad hoc ways. Examples of such special cases are advising about shooting, commenting on
repeated mistakes and cautioning about time wasting by moving only into visited caves.
We hope eventually to include these in our theory.

2.4 Sensitivity to the student

Although no student modelling is done by the current version of the system there are
two comments to be made about the way the program deals with the issue. First, some
adaptation to student performance levels is made even without active modelling. The
student is asked to rate himself on a four point scale of Wumpus hunting ability. It would
be fairly easy to have the program actively make such coarse judgements over a period of a
few games. The rating influences the advisor behaviour in three ways.

a) provision for initial advice,
b) pruning explanations,
c) pruning the expert's deductions.

If the player is a raw beginner there are certain features of the game he might not
have realised. For example, bats are not as dangerous as pits since they usually land you in
a safe cave. Immediate observations such as these are told perhaps once or twice to a
beginner and are not mentioned again.

The program can generate detailed explanations by tracing through the deductions
made by the expert in determining such facts as probabilities of bats. It is useful to prune
this advice leaving only relevant facts. The two most general approaches involve
techniques not yet included in our advisor. One involves natural language dialogue. If the
student were able to ask the program for detailed explanations when he needed them, the
advisor could explain in a top-down fashion, beginning with the main steps of the
deductions and awaiting prompting for particular substeps. It is possible to allow some
form of prompting without a natural language capability if for each lower level step the
advisor asks the student whether he needs an explanation.

A second method requires a good student model to determine what the player already
knows. We incorporate a coarse version of this procedure. The student is asked to describe
his level of play as a number from 1 to 4. The difference between a very good player and a
novice is enough to justify omitting explanations of simple steps when advising the good
player. Though this does not solve the problem of overwhelming a beginner with detail, it
does improve the situation for a good player.

Finally, we assume that one who claims to be only a moderate player will not make
any of the more sophisticated deductions or probability judgements that our expert can
make. In this case we remove the relevant deduction rules from the expert to bring it more
to the level of the player. This can be expressed as, "regardless of the student he must
learn to walk before he runs". Because of the modularity of the rules we can make this
adjustment easily. The same property should aid us in designing a realistic student
modeller in the future. When carried through this leads to the notion of a "syllabus" which
is an organisation of the teaching material that provides guidance for deciding in what
order the material should be presented.
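The pruning of the expert by self-rating can be pictured with a small sketch. It is not the advisor's own code: the rule names, the level attached to each rule and the function `rules_for` are assumptions made for illustration, and only the 1 to 4 rating scale comes from the text.

```python
# Hypothetical rule table: (rule name, minimum self-rating at which it is used).
# The level assigned to each rule is an assumption made for illustration.
EXPERT_RULES = [
    ("squeak implies a bat in some neighbor", 1),
    ("visited caves reveal their contents", 1),
    ("no squeak clears all neighbors", 2),
    ("equal likelihood within a candidate set", 2),
    ("evidence can be explained away", 3),
    ("multiple evidence raises probability", 4),
]


def rules_for(rating):
    """Return the rule names the expert keeps for a student rated 1 (novice) to 4."""
    if rating not in (1, 2, 3, 4):
        raise ValueError("rating must be between 1 and 4")
    return [name for name, level in EXPERT_RULES if level <= rating]
```

Because each rule is a separate entry, lowering the rating simply drops entries. This is the kind of modularity the text relies on, and an ordered table of this sort is one possible starting point for a syllabus.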


2.5 Deduction paradigm


Most moves in a game of Wumpus yield information which may be used as evidence
for locating and evaluating dangers on the board. We describe the detailed deduction
procedures used for doing this in section 3 but it is worthwhile to make some general
observations about the deduction paradigm we used. We use four main headings for our
description.

1) An assertional data base,
2) Antecedent theorems,
3) Special representation of disjunctions,
4) Mathematical functions for evaluating probabilities.

The assertional data base contains information representing the state of the warren
when it is set up. It includes the connections between caves and the exact locations of the
player and the hazards. Initially, the player knows nothing about the hazards so we
distinguish properties and relations which describe his changing view of the world as the
game progresses from the actual state of the world. The expert, of course, plays from the
player's point of view although it is conceivable that future programs with more
sophisticated advising methods will "cheat" and help the player avoid difficulties he is
unprepared to face. There are two types of properties and relations. One set of properties
is a primary set including such properties as SMELL, VISITED, etc. It is assumed that any
player will have these as part of his vocabulary since they are so closely tied to the way in
which the rules of the game are presented to him. Other properties such as 1-AWAY,
2-AWAY are more remote. They appeared useful to us as we designed an expert. It is
important to note that the student might not have these in his vocabulary until the advisor
shows him that they are useful. Left to himself he could come up with a totally different
representation for his play. We assume that there is only one good strategy and all the
program's explanations are phrased in terms of the vocabulary needed for the inferences
involved in this. The hope is to set the student thinking along the same lines. It is
important for future work to remember that different people may represent problems
differently so that a better advisor must be able to determine a student's representation and
model him accordingly. In Wumpus type situations it may be important for the advisor to
see how the student represents the warren diagrammatically though, in general, multiple
representations pose a very difficult question. To summarise, our program uses a single
predesigned representation and attempts to impose this on the player.

Wumpus is a sufficiently simple game that antecedent methods can be used to keep
track of new deductions. Whenever any new information appears the expert draws all
implications it ever will between this and the old information. Thus we capture one aspect
of a game player. He has a view of the game state which slowly changes as new
information interacts with it. The expert has theorems which determine features of caves
such as being one cave away from the Wumpus, being safe, or containing the Wumpus.
Some of these are simple, for example the condition that an arrow misses the Wumpus
would trigger a theorem to assert that the cave the arrow was fired into is safe. Other
theorems have several possible triggering conditions because a feature of a cave can
depend upon features of all its neighbors. It also happens that a theorem may be triggered
to prove a property already known to be true. In order to prevent unnecessary chain
reactions of triggering, an antecedent theorem always checks first to see if its result is true
already.
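The antecedent style described here can be sketched in a few lines. This is a minimal illustration and not the advisor's implementation: the class `DataBase` and its methods `on` and `assert_fact` are hypothetical names, and only the properties MISS, VISITED and SAFE are taken from the text. For brevity the "already true" check is placed in the assertion routine rather than in each theorem.

```python
class DataBase:
    """A tiny assertional data base with antecedent (forward-chaining) theorems."""

    def __init__(self):
        self.facts = set()      # pairs such as ("SAFE", 3)
        self.theorems = {}      # property name -> theorems triggered by it

    def on(self, prop, theorem):
        """Register a theorem to run whenever `prop` is newly asserted."""
        self.theorems.setdefault(prop, []).append(theorem)

    def assert_fact(self, prop, cave):
        """Add a fact and fire its theorems; do nothing if it is already known."""
        if (prop, cave) in self.facts:
            return False        # result already true: no chain reaction
        self.facts.add((prop, cave))
        for theorem in self.theorems.get(prop, []):
            theorem(self, cave)
        return True


def make_safe(db, cave):
    """Theorem: an arrow that missed, or a safe visit, proves the cave SAFE."""
    db.assert_fact("SAFE", cave)


safe_log = []                   # records each time SAFE is newly derived
db = DataBase()
db.on("MISS", make_safe)
db.on("VISITED", make_safe)
db.on("SAFE", lambda d, cave: safe_log.append(cave))

db.assert_fact("MISS", 3)       # derives SAFE for cave 3
db.assert_fact("VISITED", 3)    # SAFE is already known, so nothing re-fires
```

After both assertions `safe_log` holds cave 3 once only, which is the behaviour the check is meant to guarantee.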

These design decisions are commonplace to AI programmers but take on a new
light in the advice giving program. They are features which could improve a player's
game if he organised his knowledge by them.

When the expert deals with bat and pit inferences it is interested in the provable
locations of bats and pits. This requires it to represent disjunctions such as "there must be
a bat in cave 1, 2 or 3". We were led to use a special representation in terms of candidate
sets. In the example just given there would be a candidate set of (cave1 cave2 cave3). Bats
and pits deduction procedures were designed around this notation and manipulated using
intersection, size and set inclusion.

Evaluating the likelihood of a bat for any particular cave differs from the logical
deduction process used to find the exact features of caves since it involves probability. It is
extremely hard and messy to apply probability theory exactly to the Wumpus situation. All
probabilities are conditional on the partial information already accrued at the particular
stage of the game. This leads to complex formulae at best and exhaustive combinatorial
search at worst. Our expert is instead a model of heuristic and approximate probabilistic
reasoning of the kind that knowledgeable game players use in common sense judgements
about the game. We determined four general methods that might well be used to estimate
probabilities and adjust the results to account for multiple evidence and the
phenomenon of evidence being explained away. Our rules embody simplifying
assumptions and are generally useful outside of Wumpus. Though we expect that most
students will use some qualitative analogue of our rules, the advisor represents them as
mathematical formulae embodied in procedures. This has a quantitative nature which
makes verbal advice hard to give. The advisor overcomes this partially by pointing out the
evidence it uses as data for its formulae and then saying that the student should deduce it
is likely (probable, etc) that the cave in question contains a hazard. We don't yet know how
much advice giving about common sense reasoning can be based on a quantitative model.

2.6 Generation of explanations 

The Wumpus advisor gives detailed explanations of its reasoning. This leads the
student to deduce useful properties of the board position and to use them when deciding on
an appropriate move. Explanations are produced in a very simple way similar to that used
in Stansfield (1975). An explanation bears an almost isomorphic relationship to the
deduction procedure that is being explained. Each general rule of inference is associated
with an explanation function. If the rule is of the form "A and B implies C", the
explanation function prints out an explanation of the basic form "C because A and B".
Since rules may be applied in many cases, many explanations can be produced by the same
explanation function. This is only the simplest example of the method which is extended
in two ways. First, A and B, the premises of the rule, may themselves be consequences of
other facts and implied by other rules. The explanation function for "A and B implies C"
calls the explanation functions for these rules and so on. Eventually a complete and
detailed explanation of the inferencing is produced. Second, each explanation function is a
procedure and can easily have idiosyncratic behaviour. One common addition is for a rule
to state itself as well as the particular instance. So we could have "Caves you have visited
are safe. You have visited cave 3 so it is safe". It would be possible by keeping a simple
record to have the rule printed out with the instance for the first few times only. Other
additions make the English output flow better and, occasionally, context sensitive aspects
can be added. The program will usually refer to a visited cave as "cave X which has been
visited" but because of context might say "cave X where you are now". Up to a point, these
embellishments are easily added and the advisor has many. A general purpose English
output program must be the next step (see Slocum 19^, McDonald (forthcoming), Riesbeck
1975).
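The recursive scheme of "C because A and B" can be sketched as follows. The sketch is data-driven for brevity, whereas the advisor attaches a separate procedure to each rule. The table `DERIVATIONS` and the function `explain` are hypothetical, and the sample facts are only loosely modelled on the bat example of section 3.

```python
# Hypothetical record of how each derived fact was obtained: conclusion -> premises.
# Facts that do not appear as keys are direct observations and need no explanation.
DERIVATIONS = {
    "cave 3 contains a bat": [
        "cave 1 squeaks",
        "caves 2 and 6 are ruled out",
    ],
    "caves 2 and 6 are ruled out": [
        "cave 2 has been visited",
        "cave 7 does not squeak",
    ],
}


def explain(fact, derivations):
    """Return explanation sentences for `fact`, explaining derived premises first."""
    if fact not in derivations:
        return []                       # an observation: nothing to justify
    premises = derivations[fact]
    lines = []
    for premise in premises:
        lines.extend(explain(premise, derivations))
    sentence = fact[0].upper() + fact[1:] + " because " + " and ".join(premises) + "."
    lines.append(sentence)
    return lines


explanation = explain("cave 3 contains a bat", DERIVATIONS)
```

Each derived premise is explained before the conclusion that uses it, so the output mirrors the shape of the deduction. Printing a rule's general statement the first few times only would need a small record of what has already been said.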

Since the expert program is modular and contains an executive, the explanation
functions fall neatly into classes. Some explain about bats and pits or about the Wumpus
and some about the strategy as a whole.

It is easy to see from the example that the explanations become long winded and
detailed. To some extent their hierarchical nature eases this but it would be preferable for
only the more relevant or important parts of the explanation to be given to the student so
that he is not confused by too much information. We could have included various ad hoc
techniques for pruning explanations which would have been moderately satisfactory. It
seems more sensible from a research standpoint to first improve the student model so that
there is a good basis for judgements of relevancy.


3. Program details


3.1 Bats and pits modules

The bats and pits modules of the expert embody about eight rules of inference and
use them to determine the positions of bats and pits. They are of two kinds, logical rules
which can be used to deduce the exact location of hazards, and probabilistic rules which
can only estimate the likelihood of bats and pits in any particular cave. Both types of rule
have already been discussed and here we describe them in detail.

There are four logical rules for bats.

a) A squeak heard in any cave implies that there is a bat in at least one neighbor of the
cave.

b) Visiting a cave will tell you whether that cave contains a bat.

c) If a cave does not squeak then none of its neighbors can contain a bat.

d) If the total number of bats is given, they can sometimes be located exactly.

Rules for pits are almost identical, the one difference being that rule b) is of little
use. If you fall in a pit the game is over whereas a bat may simply carry you to a safe cave
elsewhere. Rule d) is fairly complex and is not implemented in our system. It works
because if there are many more caves next to known squeak caves and only a few bats in
the warren then only certain arrangements of bats will explain all the squeaks. The crucial
point about rules a) b) and c) which a beginner may not immediately notice is that b) and c)
may rule out possibilities suggested by a) to leave only one. In this case a bat or pit has
been exactly located. Knowing the exact location of a bat in such a manner can in turn
allow the probability rules to explain away certain squeaks neighboring that bat. This
could lead the expert to conclude that certain caves are safe.



figure 2.


Consider the example in figure 2. Caves with circles around their numbers have been
visited; caves 1, 8 and 4 are known to squeak; caves 2 and 7 are known not to squeak.
Because of the squeak at cave 1, either cave 2, 3 or 6 must contain a bat by rule a). But 2
cannot by rule b) (it has been visited) and 6 cannot because of the lack of a squeak at cave
7 by rule c). This leaves only cave 3 as the bat cave. But a bat at 3 explains away the
squeak at 4 so there is no reason to suspect a bat at 11 or 5.

To implement the rules we use candidate sets. Firstly, the state of the board as seen
by the player is represented in the data-base using the properties KNOWN-SQUEAK,
KNOWN-NOT-SQUEAK, VISITED, V-BAT and KNOWN-NEIGHBORS. V-BAT
means the cave has been visited and contains a bat which therefore carried the player away
before he saw the neighbors of the cave. Next a candidate set is generated for each squeak
cave, duplicate sets being flushed. At least one bat must be in each candidate set. A unary
candidate set is added to account for each visited bat cave. The sets produced to account
for figure 2 would be

(2 3 6) (3 11 5) (7 10) (10)

Next, rules b) and c) are applied to remove caves from candidate sets. We now have
the sets

(3) (3 11 5) (10)

Logically, in our example, we have deduced that caves 3 and 10 contain bats. If we
knew that there were only two bats in the warren, the unimplemented rule d) could be used
to prove that 11 and 5 are absolutely safe.
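The candidate-set procedure for logical rules a), b) and c) can be sketched as below. Since figure 2 is not reproduced here, the adjacency lists, the number of the third squeak cave (taken to be 8) and the function name `candidate_sets` are assumptions. The data are chosen only so that the initial sets are (2 3 6), (3 11 5), (7 10) and (10).

```python
def candidate_sets(neighbors, squeak, no_squeak, visited_safe, visited_bat):
    """Build candidate sets by rule a), then prune them with rules b) and c)."""
    # Rule a): each squeak cave has at least one bat among its neighbors.
    sets = [set(neighbors[cave]) for cave in squeak]
    # A unary set accounts for each visited bat cave (V-BAT).
    sets += [{cave} for cave in visited_bat]

    # Rule b): caves visited without meeting a bat hold no bat.
    ruled_out = set(visited_safe)
    # Rule c): no neighbor of a non-squeaking cave holds a bat.
    for cave in no_squeak:
        ruled_out |= set(neighbors[cave])

    pruned = []
    for s in sets:
        remaining = frozenset(s - ruled_out)
        if remaining not in pruned:     # duplicate sets are flushed
            pruned.append(remaining)
    return pruned


# Assumed board, consistent with the sets quoted for figure 2.
NEIGHBORS = {1: [2, 3, 6], 4: [3, 11, 5], 8: [7, 10], 2: [1], 7: [6, 8]}
result = candidate_sets(
    NEIGHBORS,
    squeak=[1, 4, 8],
    no_squeak=[2, 7],
    visited_safe={1, 2, 4, 7, 8},
    visited_bat={10},
)
located = {next(iter(s)) for s in result if len(s) == 1}
```

With these assumptions `result` is (3), (3 11 5) and (10), and the singleton sets give the located bats 3 and 10.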

At this stage, the logical rules are exhausted and the probability rules take over.
There are four probability rules, each corresponding to a fairly general rule for estimating
likelihoods based on limited evidence. The rules are qualitative versions of the application
of simple probability theory and Bayes' rule. We will describe each one saying a few words
about its implementation. The rules are as follows.

a) Equal likelihood
b) Evidence can be explained away
c) Multiple evidence can increase probability
d) Multiple evidence can decrease some probabilities

Whenever exactly one of a set of equally likely outcomes must occur, simple
probability says that the total probability must be 1 and an estimate can be made of the
probability of each outcome. This rule applies approximately to any candidate set
produced by the logical rules. If the set has N members then we may deduce that the
probability of a bat being in any particular cave is 1/N. We can compare the safety of
alternative moves because caves in smaller candidate sets are more likely to contain bats.
This rule is approximate for two reasons. Firstly, there may be two bats in any candidate
set although for a large warren and few hazards this is unlikely to make the rule
inaccurate. Secondly, knowledge about the remainder of the warren may influence the
probability of a particular cave having a bat in subtle ways.



A particularly common way that this second case arises is that a probable or certain
bat in one cave explains away evidence that supports a bat being in that cave as well as in
several others. This rule can be applied whenever one candidate set is a subset of another.
Figure 3 shows a case with two candidate sets (1 2 3) and (1 2). The bat in (1 2) due to the
squeaking explains away the squeak at 4 that gave rise to (1 2 3) and there is no reason to
believe a bat exists in 3. Evidence supporting 3 is explained away by the bat in (1 2). Our
current advisor implements this by reducing the probability for 3 to the likelihood that a
bat was put in 3 by the program which set up the board.
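The equal-likelihood and explain-away rules can be combined in a short sketch. It rests on assumptions: `bat_estimates` is a hypothetical name, and `prior` stands for the set-up likelihood of a bat in an arbitrary cave, which the caller must supply (for instance the number of bats divided by the number of caves). The advisor's actual formulae are not given in the text.

```python
def bat_estimates(candidate_sets, prior):
    """Estimate P(bat) per cave using equal likelihood and explaining away."""
    sets = [frozenset(s) for s in candidate_sets]
    estimates = {}
    for s in sets:
        smaller = [t for t in sets if t < s]        # proper subsets of s
        if smaller:
            # A bat in a smaller set explains the evidence for s, so the
            # caves outside those subsets fall back to the set-up prior.
            explaining = frozenset().union(*smaller)
            caves, share = s - explaining, prior
        else:
            # Equal likelihood: one bat shared among the N members.
            caves, share = s, 1.0 / len(s)
        for cave in caves:
            estimates[cave] = max(estimates.get(cave, 0.0), share)
    return estimates


figure_3 = bat_estimates([(1, 2, 3), (1, 2)], prior=0.1)
```

For the two sets (1 2 3) and (1 2), caves 1 and 2 each receive 1/2 while cave 3 drops to the prior, here taken as 0.1 purely for illustration.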



figure 3.




If two candidate sets overlap we have a situation of multiple evidence. Figure 4
shows a case where a squeak at 1 gave rise to a candidate set (2 3 4) and a squeak at 5 to a
set (4 6 7). A bat at 4 would explain all this evidence. Alternatively, two pieces of evidence
point to 4 but only one each to 2, 3, 6 and 7. We implement the rule for this situation by
considering the probability of no bat at 4.

P(bat at 4) = 1 - P(no bat at 4)
            = 1 - P(bat in (2 3)) * P(bat in (6 7))

A general version of the formula can easily be derived from this.
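One way to generalise the formula is to multiply, over every candidate set containing the cave, the chance that the bat accounting for that set lies elsewhere in it. The sketch below makes that assumption together with equal likelihood inside each set. The function name `multiple_evidence` is hypothetical and the text does not state the general form used in the program.

```python
from fractions import Fraction


def multiple_evidence(candidate_sets, cave):
    """P(bat at cave) = 1 - product of P(bat elsewhere in S) over sets S holding cave."""
    p_no_bat = Fraction(1)
    for s in candidate_sets:
        if cave in s:
            # Under equal likelihood the bat for S is elsewhere with chance (N-1)/N.
            p_no_bat *= Fraction(len(s) - 1, len(s))
    return 1 - p_no_bat


FIGURE_4 = [(2, 3, 4), (4, 6, 7)]
p_cave_4 = multiple_evidence(FIGURE_4, 4)   # 1 - (2/3)(2/3)
p_cave_2 = multiple_evidence(FIGURE_4, 2)   # single evidence: 1 - 2/3
```

For figure 4 this gives 5/9 for cave 4 and 1/3 for each of the others, so the estimates within (2 3 4) now sum to 11/9.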

This rule introduces a problem. If the probability of the common case is increased
then the total probability for each candidate set is raised above 1.0 which violates our initial
approximation of one danger per cave. The greater probability of there being a bat in the
common area should partially explain away the evidence and reduce the probabilities for
the other cases. Since the exact formula for this would be cumbersome our program uses a
rough formula to average out the discrepancy by reducing all the probabilities by a little.
This is the fourth rule.




figure 




Another problem arises when more than one rule applies at once. Figure 5 shows two
caves, 1 and 2, which both squeak and are both neighbors of cave 3. If cave 1 is also next to a
cave which is known to contain a bat then its squeak is totally explained away and gives no
further information. It cannot be used in conjunction with cave 2 as a case of double
evidence for a bat in the cave connecting 1 and 2. This means that we must apply the
explain-away rule before the double-evidence rule. Such priority constraints occur often in
programming so we should not be surprised when a student needs to know them as part of
his own program for playing a game well.

The four rules give estimates that fit the intuitive judgements generally made by
players. The advisor states the factors used in the evaluation and gives a rounded off
version of the result of its own formulae. It was unimportant for us that the student could
precisely apply probability theory and we preferred that he be led towards making well-
based estimates. The four rules we use are suitable for this and are applicable in other
domains.

3.2 The Wumpus module

More complex deductions can be made about the location of the Wumpus than about
bats and pits. Because a smell means that a Wumpus is within two caves rather than in a
neighboring one it is weaker evidence than a squeak or breeze and gives rise to a much
larger candidate set of possible Wumpus caves. On the other hand, absence of smell rules
out more caves than would absence of the other types of evidence. Since smell-generated
candidate sets have a radius of two caves it is possible that a neighbor of the smell cave is
unvisited making the candidate set incomplete. It is also difficult to tell if moving from
one smell cave to another takes you closer, further away or leaves you at the same distance
from the Wumpus. All these factors lead to a more complex set of inference rules than we
need for the bats modules.

There are two simplifications which make the problem tractable. Future programs
might cover the more general case and it would also be interesting to vary the type of
Wumpus evidence (intensity of the smell with distance from the Wumpus or number of
Wumpi for example) to see what rules would then be needed. The two simplifications we
have made are as follows.

1) The expert only makes logical deductions about the Wumpus and not probabilistic
judgements.

2) In the original game the Wumpus may move when an arrow is fired which misses
him. The Wumpus is fixed in our version.

We examine ways to make probabilistic judgements about the Wumpus later. If the
second simplification is relaxed and the Wumpus is allowed to move, older evidence would
be degraded but would not lose all its value. A smell cave which before a shot had implied
that the Wumpus was within two caves, would now mean he must now be within three. A
no-smell cave would now guarantee only that he is not in one of the cave's neighbors. The
increase in variety of evidence would make the rules more complex.


We use five major Wumpus finding rules. Each is further away from the rules of
play than the bats rules are and requires some simple proof of its correctness which
naturally should play a part in the explanation of the rule given by the advisor. The rules
are methods for deciding one of five properties of a cave namely, SAFE, TWO-AWAY,
ONE-AWAY, WUMPUS, and MORE-THAN-ONE-AWAY.

Rule 1: GOAL - To prove a cave is SAFE
A cave is safe;

a) if it has been safely visited
b) if an arrow has been fired into the cave and no Wumpus was hit
c) if there is a NO-SMELL cave within two caves of it

This rule is easily justified and is invoked whenever one of the properties, VISITED,
MISS, NO-SMELL is asserted about the cave in question.

Rule 2: GOAL - To prove a cave is MORE-THAN-ONE-AWAY
A cave is more-than-one-away from the Wumpus;

a) if we can prove it to be two-away
b) if it doesn't smell
c) if a neighboring cave does not smell

a) is obvious and b) and c) are simple since if a cave were the Wumpus or one away
then all of its neighbors would smell. This rule is not exhaustive. There are probably
other ways to prove more-than-one-awayness but its use is limited to these special cases as a
help to later rules.

Rule 3: GOAL - To prove a cave is TWO-AWAY
A cave is two caves from the Wumpus;

a) if it smells and it is more-than-one-away
b) if it smells (so all the neighbors are known) and none of the neighbors is the
Wumpus.

Both parts of this rule need comment. Rule a) depends on the configuration shown
in figure 6.


SMELL     NO-SMELL

Cave 1 must be exactly two from the Wumpus.

figure 6.

Since cave 1 smells it is within two caves of the Wumpus and must be either one or
two caves away. But cave 2 must be more than two caves away and, as 1 and 2 are
connected, the only consistent case is for cave 1 to be two away from the Wumpus. Both
caves must be visited for this rule to be applied and the rule is triggered when any SMELL
or NO-SMELL cave is discovered.

Case b) succeeds by proving that the cave is more-than-one-away from the Wumpus.
Since it smells, it is either one or two away and so must be two away. Notice that rule 2
does not help here. Instead, we prove that no neighbor is the Wumpus cave so the cave in
question is more-than-one-away. This rule is triggered when any cave is shown to be safe
by rule 1. All neighbors of the new safe cave are checked for smells and any cave which
does smell has the rule applied to it. Alternatively, a new smell cave may trigger the rule.
If either case of rule 3 succeeds it will trigger rule 2.

Rule 4: GOAL - To prove a cave is ONE-AWAY.

A cave is one away from the Wumpus if it has a neighbor which is two away and all
other neighbors of that cave are more-than-one-away.



figure 7.




Figure 7 shows an example in which cave 6 must be one away from the Wumpus.
The reasoning is as follows. By rule 3a), cave 2 is two away. But we know all its neighbors
and one of them must be one away. Cave 1 cannot be, by rule 2b) and since cave 3 is two
away, by rule 3a), cave 3 cannot be one away either by rule 2a). By process of elimination,
this means that cave 6 must be one away.
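Rules 2b), 2c), 3a) and 4 can be sketched as predicates over the player's knowledge. The sketch is an illustration only: the class `Board`, its method names and the adjacency assumed for figure 7 (cave 2 adjoining caves 1, 3 and 6, with cave 1 not smelling) are assumptions, since the figure itself is not reproduced here. Rule 3b) and rule 2a) are left out to keep the predicates free of circular calls.

```python
from dataclasses import dataclass


@dataclass
class Board:
    neighbors: dict   # visited cave -> list of its neighboring caves
    smell: set        # visited caves that smell
    no_smell: set     # visited caves that do not smell

    def more_than_one_away(self, cave):
        # Rule 2b): the cave itself does not smell.
        if cave in self.no_smell:
            return True
        # Rule 2c): some neighboring cave does not smell.
        return any(cave in self.neighbors.get(n, ()) for n in self.no_smell)

    def two_away(self, cave):
        # Rule 3a): it smells and it is more-than-one-away.
        return cave in self.smell and self.more_than_one_away(cave)

    def one_away(self, cave):
        # Rule 4: some two-away neighbor has all its other neighbors
        # more-than-one-away, leaving this cave by elimination.
        for other, ns in self.neighbors.items():
            if cave in ns and self.two_away(other):
                if all(self.more_than_one_away(c) for c in ns if c != cave):
                    return True
        return False


# Assumed reading of figure 7.
FIGURE_7 = Board(
    neighbors={1: [2, 3], 2: [1, 3, 6], 3: [1, 2]},
    smell={2, 3},
    no_smell={1},
)
```

On this board `FIGURE_7.two_away(2)` and `FIGURE_7.one_away(6)` both hold, while cave 3 is not one away, which follows the elimination argument given in the text.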

Notice that rules 3b) and 4) are similar to the bats and pits logical rules. First a
candidate set is generated in which at least one element has a desired property. Then all
but one of the members are deleted and the remaining possibility becomes a certainty. This
technique could be called "reasoning by elimination". In the bats case the property was
directly related to the game rules whereas the Wumpus rules require some thought to
discover relevant properties such as ONE-AWAY. It would be interesting to see if we could
design an advisor that would lead a student to develop these Wumpus rules from the bats
rules and to realise that reasoning by default is a commonly useful method worth
identifying and naming. We leave it to the reader to see how the method generalises to give
rules for detecting Wumpi who smell more and can be detected from greater distances.
Rule 5 also uses reasoning by elimination.

Rule 5: GOAL - To prove a cave contains the Wumpus

A cave must contain the Wumpus if it has a neighbor which is one away from the
Wumpus and all other neighbors of that cave are safe.


We can see an example of rule 5 in figure 8, an extension of figure 7. Suppose the
player visited cave 6 and discovered it smelled and connected with 3 and 10. Since 6 is one
away and neither 2 nor 3 is the Wumpus by rule 1, cave 10 must be the Wumpus.



figure 8.


3.3 General comments on the Wumpus module

Despite the simplifications we made, the rules for Wumpus hunting are still complex.
There are common elements and the rules inter-relate by triggering each other at several
points. Nor are the rules complete. We could use the fact that there is only one Wumpus
to help locate him. Figure 9 is an extension of figure 7 where we visit cave 6 and discover
the new neighbors 7 and 8. The Wumpus must be one of these. But we have only one
arrow left and daren't waste it. So we visit cave 5 and discover neighbors 8 and 9. We
have two candidate sets for the one Wumpus, (7 8) and (8 9). He must be at 8.



figure 9.



Such a large body of knowledge makes advice giving a difficult problem. Our
advisor applies the rules, detects any instance in which the student could have made a
better move and prints out a protocol of the rule's application. This naive tutorial
technique could be improved in several ways. First, care needs to be taken over the
distinction between a rule and its instances. Our advisor follows the paradigm of teaching
by example. It should also teach by giving general explanations. Second, the rules
inter-relate and it is non-trivial to organise them all to simplify their application. It is possible
that a player knows all the rules but is muddled about them in practice. Thirdly, we build
no model of the student's knowledge so it is impossible to debug him when he uses an
incorrect version of a rule. He may prove that a cave is two away by using rule 3a) but
then think that all smell caves next to it must be closer to the Wumpus and must be one
away. We need a way to classify, detect and correct these errors.

Just as our expert could make qualitative judgements about the probability of bats
and pits, it is possible to introduce rules for judging the likely location of the Wumpus.
There are two ways to do this. We can make use of the similarity between Wumpus
hunting and bat finding where reasoning by elimination is used to set up candidate sets.
All the probabilistic bat rules will then apply to the candidate sets. Rules 3a), 4 and 5 give
rise to candidate sets for the properties TWO-AWAY, ONE-AWAY and WUMPUS
respectively. There is a transitivity phenomenon too. Probability results from rules 3 and 4
can be used as evidence in rules 4 and 5 respectively. Here is possibly a general principle
of plausible reasoning. An exact rule has a probabilistic counterpart for use when
incomplete or uncertain evidence is fed into it. This would provide a nice basis for an
advisor whose goal was to teach plausible reasoning by weighing evidence.



A second totally different strategy for making probability judgements is possible and
gives rise to further principles of plausible reasoning of very general application. Given a
board state such as that in figure 10, we can enumerate several hypotheses for the location
of the Wumpus. Consider for example caves 1, 5 and 4. Each of these hypotheses will
explain away some of the evidence in the figure. None of the hypotheses is totally
discounted but each needs a different set of extra properties to be true of the board
which are still to be tested. A Wumpus at e would explain all the smells and also the
smell/no-smell pair at 3/10. It needs no extra things to be true of the board. Hypothesising
cave 5 however, does not explain the smells at 2, 3 and 1. It thus needs extra board
connections and these may or may not exist. Some measures of the evidence explained and
the extra constraints imposed on future discoveries can be used to compare the likelihoods
of various hypotheses. Both measures are needed. Constraint measures can be used to
compare hypotheses and the explanation measure provides some absolute measure of
confidence.

3.4 The executive's move classification

The bats, pits and Wumpus experts are used to determine the probabilities of
meeting a hazard in any particular cave. This information must be used by the executive
to evaluate a move. The executive forms the strategy component of a Wumpus player but
since the game requires little lookahead, planning strategies are hardly needed. Each move
can be evaluated on the basis of the current state and the available alternative moves. Two
strategies exist and a player's behaviour can follow either or both for several moves even
though he makes all his decisions move by move. The strategies are called "playing safe"
and "gaining information". Wasting time can be thought of as a third but is a degenerate
case of the first and the advisor deals with it impatiently.

Playing safe means making the safest move you can find. Clearly the safety of a
cave depends on the probability of it containing a hazard and this is reported on by the
respective experts. Pits and the Wumpus mean certain death so they are easy to deal with.
They are independent and their joint probabilities for any cave can be computed. A bat
may be relatively safe since it does not necessarily leave the player in a deadly cave. The
executive estimates the danger by using a simple formula which we will derive. If the
number of caves is N, the number of bats b, and the number of pits p, then if we assume
that no cave contains more than one hazard (a good approximation if N is much larger
than p and b) we can reason as follows.

    P(death by bat) = P(you land on a pit)
                    + P(you land on the Wumpus)
                    + P(you land on a bat) * P(death by bat)

    P(death by bat) = deadly caves / non-bat caves = (p+1)/(N-b)

This works because after being dropped by a bat in a bat cave again the chances of
death are the same as they were on first moving into a bat cave. Another way of thinking
of this would be to sum an infinite series with a term for each total number of bats it is
possible to land on in one move. A third way is to realise that the process of being moved
about by bats must eventually stop in a non-bat cave and there is no reason to prefer one
over any other so the chances are equally likely.
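
The first two arguments can be checked against each other numerically. The following
Python sketch is an illustration only and is not the executive's code: the function names
and the values N = 20, b = 3, p = 3 are assumptions chosen for the example. It evaluates
the closed form (p+1)/(N-b) and compares it with the infinite series, truncated after a
fixed number of terms.

```python
from fractions import Fraction


def death_by_bat(n_caves: int, n_bats: int, n_pits: int) -> Fraction:
    """Closed form from the text: deadly caves / non-bat caves."""
    return Fraction(n_pits + 1, n_caves - n_bats)


def death_by_bat_series(n_caves: int, n_bats: int, n_pits: int, terms: int = 200) -> float:
    """Series form: term k is 'dropped on k further bats, then on a deadly cave'."""
    p_deadly = (n_pits + 1) / n_caves  # a drop lands on a pit or on the Wumpus
    p_bat = n_bats / n_caves           # a drop lands on another bat
    return sum(p_deadly * p_bat ** k for k in range(terms))


# Assumed example values, not taken from the report.
closed = death_by_bat(20, 3, 3)
series = death_by_bat_series(20, 3, 3)
print(closed, float(closed), series)
```

Both functions rest on the same assumption as the derivation, namely that no cave holds
more than one hazard.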


We explained the derivation of this formula in such detail because it is an
opportunity to consider the amount of knowledge about the application of probability that
a perfect advisor might need to explain. The moral is cautionary. In practice our
executive simply evaluates the formula and states the likelihood of death as a part of its
explanation of the danger in a cave. The student is expected to come to some similar
decision qualitatively and to improve his reasoning to be coincident with the advisor's.

Shooting arrows is also a tricky type of move to evaluate. Our executive only deals
with this in special cases when the Wumpus is either located or known to be in a different
direction from the shot. A true estimate of the risk involved should include the probability
of hitting the Wumpus since arrows can only be dangerous when they miss.

The second strategy for play is to gain information. Again, a move which has been
made before gains nothing and the strategy degenerates into time-wasting. Information can
be gathered in two main ways. Moving to a new cave gives information about the warren
and perhaps also about bats and pits. However, new information about bats and pits can
hardly be predicted. If a cave is suspected of being a bat or a pit, discovering that it is not
could allow inferences to be drawn about the actual location of the hazard. In Wumpus,
examinations in such detail are not very significant but it is easy to imagine real-world
situations where a risk is worth taking for the negative information that may be obtained.
A naive Wumpus player may rush into dangers for this reason and the advisor will caution
him. Since the Wumpus can be smelled from two caves away and as certain caves can be
deduced to be two away it is possible and often safe to move into a cave that has a good
chance of giving information about the Wumpus. Again, the true value of the information
can only be gauged by considering the inferences it would allow. For our purposes we
simply distinguish between "possible" information gain and "probable" gain.

The two strategies interact so that a decision theory model is needed to compare
accurately the information gained with the risk involved. Since the version of Wumpus we
use places no time constraints on the player, our advisor makes safe play more important
than informative play. Before describing the mechanism for this, consider the following
example of a case of complex evaluation. In the beginning of the game it may be useful to
take a bat to reach new parts of the warren, especially if all other moves in the locality are
dangerous. There are relatively few pits so it is unlikely that death will ensue. Later in the
game the safety of a bat is unchanged. At this stage, most of the warren might have been
investigated in which case the information value of taking a bat is lowered considerably. It
may no longer be worth the risk. It is possible for the player to be completely trapped so he
can only make deadly moves or repeat his old ones. In this case the value of taking a bat is
that it might drop you in a new situation even if this had been visited earlier. A decision
theory and planning theory of Wumpus could in future be the basis of an advisor for this
level of play.
Figure 11 shows the move classification scheme used by the current executive to
capture the two strategies. Firing arrows and using bats to gain information have been
excluded from the evaluation. Safety is factored into safety from bats and pits, and safety
from the Wumpus. There are seven classes of move excluding repeat moves and they are
numbered roughly in order of goodness. The seven can be divided into groups of three,
three and one. The first three are totally safe from bats and pits as proved by the experts.
Types 4, 5 and 6 are unsafe according to bats and pits and type 7 is certain death. The two
groups of three are similarly organised according to Wumpus conditions. Best of all are
moves known to be safe but next to smells and therefore likely to reveal information about
the Wumpus. Second are those caves which are safe from the Wumpus but unlikely to
give information about it. Finally, we have the caves which are unsafe from the Wumpus
and therefore likely to give information about it. Each move type 4, 5 and 6 can be further
factored according to the actual degree of bat and pit unsafeness.
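
The seven classes can be written down as a small decision procedure. The Python sketch
below is an illustration under stated assumptions, not the executive's own table: the
record CaveEstimate, its field names, and the function classify_move are hypothetical,
and the experts' reports are reduced to one danger figure and three flags.

```python
from dataclasses import dataclass


@dataclass
class CaveEstimate:
    """Hypothetical summary of what the experts report about one cave."""
    bat_pit_danger: float        # 0.0 means proved safe from bats and pits
    wumpus_safe: bool            # the Wumpus cannot be in this cave
    near_smell: bool             # next to smells, so likely to reveal information
    certain_death: bool = False  # a known pit or the known Wumpus location


def classify_move(cave: CaveEstimate) -> int:
    """Return the move class, 1 (best) to 7 (certain death)."""
    if cave.certain_death:
        return 7
    # Classes 1 to 3 are safe from bats and pits; classes 4 to 6 are not.
    base = 0 if cave.bat_pit_danger == 0.0 else 3
    if cave.wumpus_safe and cave.near_smell:
        return base + 1  # safe from the Wumpus and likely to be informative
    if cave.wumpus_safe:
        return base + 2  # safe from the Wumpus but unlikely to be informative
    return base + 3      # unsafe from the Wumpus, hence informative


print(classify_move(CaveEstimate(0.0, True, True)))
print(classify_move(CaveEstimate(0.2, False, False)))
```

Within classes 4, 5 and 6 the value of bat_pit_danger is still available for the further
ranking by degree of unsafeness.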

The classification is effective and to some extent distinguishes the strategies and
places them in order of safety. It also clarifies the advice-giving role of the executive for,
as we shall describe, each move type has a corresponding analyst which specialises in
advising about moves of that type.

There are difficulties in capturing the interplay between strategies in a classification
scheme. Consider move types 3 and 4. Both provide the same kind of information so their
ranking can only be determined for particular moves by the relative dangers involved.
Again, classes 4 and 6 give the same information and under certain conditions each could
be better than the other. A better viewpoint is to consider decision-making under
dangerous conditions to be a decision theory problem. The expert should be able to
compare risks and profits and its explanations should be in these terms.

3.5 The flow of control

Figure 12 shows a simplified flowchart for the system. Whenever the program
requests a move, control is at point A at the head of the flowchart. Certain special cases
such as shooting and requests for help are dealt with by special programs. Otherwise, the
expert is called to classify all possible moves, in particular the one the player actually
wanted to make, and control is switched to an appropriate analyst for the player's move
type. Analysts consider the available moves to decide if the player made a good move. If
he did, the analyst allows him to go ahead but otherwise it explains why the move was bad,
partly using its own explanation functions and partly using those associated with the
individual specialists for bats, pits and the Wumpus.

The player is always allowed the option of proceeding but if he wishes to change his
move he is offered advice. When accepted, this takes the form of a good move and an
explanation of the benefits of this move over the player's.
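
The main path through this flow of control can be sketched in a few lines of Python. This
is a simplified illustration, not the program itself: the function advise and its arguments
are hypothetical, the special cases for shooting and help are left out, and moves are
compared by class number alone, whereas the analysts also weigh differences within a class.

```python
def advise(player_move, legal_moves, move_type, wants_to_change):
    """Return (move to make, explanation or None).

    move_type maps a cave to its class, 1 (best) to 7 (certain death).
    wants_to_change shows the explanation to the player and returns True
    if he decides to accept the advice.
    """
    best = min(legal_moves, key=move_type)
    if move_type(player_move) <= move_type(best):
        return player_move, None      # good move: let the player go ahead
    explanation = (
        f"Cave {player_move} is a type {move_type(player_move)} move; "
        f"cave {best} is a type {move_type(best)} move."
    )
    if wants_to_change(explanation):
        return best, explanation      # advice accepted
    return player_move, explanation   # the player may always proceed


types = {4: 2, 7: 5, 9: 1}            # hypothetical caves and their classes
print(advise(7, [4, 7, 9], types.get, lambda text: True))
```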




Flow of control 
Figure 12.
Figure 13 shows the schema for move-type analysts. It is self-explanatory except for a
few points. First, if two moves are of the same type they may or may not be of sufficiently
different quality to invoke advice-giving. Since move-types 4, 5 and 6 have a range of
safety from 0 to 1, one move can be very safe while another of the same class is very risky.
Second, the explanation functions are context sensitive. A move which is dangerous both
because of the Wumpus and pits would not always give rise to an explanation of the
Wumpus danger. If no available move was safe from the Wumpus the advisor gives the
player the benefit of the doubt and assumes he has seen this. It assumes he chose the
wrong move because he omitted to take proper account of the difference in pit safety.
These assumptions are a recent addition to the advisor and we only discovered the need for
them by using the program. It is remarkable how interaction with a program reveals
glaring design omissions which would otherwise be unnoticed.

Together the specialist modules for bats, pits and the Wumpus, the executive and the
advice-giving components of each make up the majority of the advisor.


4. A decision theory approach


The executive module of the Wumpus expert represents various types of danger and
the ways information can be gathered by means of a table. In effect, all the decisions about
trade-offs between risks and gains are compiled. This method is restrictive and some
subtleties of the trade-offs are omitted. We now describe a more uniform and general way
of dealing with such decisions that will be suitable for an improved version of the advisor.
It is based on decision theory, which is especially designed to represent problems of choice
in uncertain situations like Wumpus. The analysis of a problem using decision theory has
three components.

1. A decision tree.

This is a tree of states of the world rather like a lookahead tree for game theory or
planning. It is rooted at the initial state and at each state the player is given a set of
alternative actions from which he may choose one. In Wumpus a state represents the position
at a point in play and the choices facing the player are his legal moves. For any move the
player makes, the world can respond in a variety of ways and each has an associated
probability of occurring. If the player moves into a risky cave then two possible outcomes
are that the cave actually contains the danger or that it does not. A more detailed
description of the outcomes might specify the possible new neighbors that might be
discovered. A decision tree thus has two types of arc, those corresponding to the player's
choices and those that correspond to the world's. The only difference from a game tree is
the special way that the player's opponent behaves. In game theory he tries to make the
best move whereas in decision theory he behaves according to probabilities that can be
estimated.

2. An evaluation function for terminal nodes of the decision tree.

The terminal nodes of the decision tree have values for the decision-maker which
can be evaluated if some procedure for doing so is specified. This procedure must take
into account all of the good points of being at that state and weigh them against all of the
bad points. It calculates trade-offs. The most common method is to measure each cost or
gain with a single number and to combine these by simple linear weighting. The value of
each feature is multiplied by a weighting factor and totalled with the others. If a feature is
very good or very bad then it has a larger weighting factor either positively or negatively.

3. A back-up function.

Given a tree of possibilities and values for each of the terminal nodes it remains only
to decide on the best action to take at the initial state. It is possible to work out what
expected utility each action has by working backwards from the terminal values. Suppose
we have a state which allows several actions each of which has several outcomes all of
which are terminal. We know the probability of each outcome for a given action and we
know their values since they are terminal. The expected utility for that action is easy to
evaluate using simple probability theory. Which action should we choose? Clearly the one
with the highest expected utility. This means that the expected utility for the state is the
highest of the expected utilities of the actions available at that state. Now the state can be
considered a terminal state since it has been valued and we can continue backing up the
tree until we determine which action to take from our starting state.
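
The three components fit together in a few lines. The following Python sketch is a minimal
illustration, not part of the existing advisor: the class names Leaf and State, the function
names, and the numbers in the example tree are all assumptions. Terminal values stand for
the output of the evaluation function, and back_up takes the expectation over the world's
arcs and the maximum over the player's arcs.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple, Union


@dataclass
class Leaf:
    value: float  # result of the evaluation function at a terminal node


@dataclass
class State:
    # Player's arcs: action name -> world's arcs as (probability, next node).
    actions: Dict[str, List[Tuple[float, "Node"]]] = field(default_factory=dict)


Node = Union[Leaf, State]


def action_utility(outcomes) -> float:
    """Expected utility of one action: probability-weighted sum over outcomes."""
    return sum(p * back_up(child)[1] for p, child in outcomes)


def back_up(node):
    """Return (best action or None, expected utility) for a node."""
    if isinstance(node, Leaf):
        return None, node.value
    best = max(node.actions, key=lambda a: action_utility(node.actions[a]))
    return best, action_utility(node.actions[best])


# Assumed example: a risky cave with a 1 in 4 chance of death against a safe one.
tree = State({
    "risky cave": [(0.25, Leaf(-100.0)), (0.75, Leaf(10.0))],
    "safe cave": [(1.0, Leaf(1.0))],
})
print(back_up(tree))
```

Once a state has been valued in this way it is treated exactly like a terminal node by the
state above it, which is what lets the recursion continue to the root.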


This approach to the analysis of a decision problem assumes that the value of a state
can be determined from the values of its component features. Four of these components
clearly occur in Wumpus.

1. Risk of death

The utility of dying should be very large and negative. It cannot be minus infinity
since this would multiply by any probability of death to be minus infinity. Instead, utilities
could be a function of the probability of death. There are various ways that death can
occur: falling into a pit, wandering into the Wumpus, shooting yourself with an arrow, or
being carried away by a bat into a dangerous place. These possibilities reveal themselves in
the decision tree. If a student fails to account for any of them it is reflected in his
incomplete decision tree. The probabilities of several of these cases are quite tricky to deal
with.

2. Information gain

The amount and value of information gained by any move are important. The
value depends on what is already known as new facts may allow important inferences.
Information may be gained about the warren itself and about the dangers in it. In
variations of the game where the Wumpus may move it is possible to lose information.
Inferences must be dealt with by a set of logical and probabilistic rules such as we have in
the existing advisor.




3. Goal

The ultimate goal of the game is obviously an important consideration in deciding
upon the value of a state. It is not sufficient to make safe moves or to find out
information. It is also important to kill the Wumpus. Killing the Wumpus must thus have
a high positive value. A small chance of killing it may be better than a large chance of
gaining information. In variations of the game it would be possible to injure the Wumpus,
perhaps slowing him down if he can move around the warren.

4. Resources

A very important value in real-world situations is the value of resources. This was
after all one of the main reasons for inventing money. The only resource used in our
current version of the game is a supply of arrows. It is clearly very silly to take a chance
with your last arrow though it may be worthwhile testing hypothetical Wumpus locations
with the first few. Many other resource types could be added to the game, time constraints
being one of the more general. Being given a fixed time to play before the warren falls in
on you will affect your play. It would become bad play to waste time. A more interesting way
to introduce time is to make the Wumpus actively look for the player, eating him when it
finds him. This could become a two player game with the advisor watching or else the
advisor could be one of the players.
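
A linear evaluation over these four components might look like the Python sketch below.
It is an illustration only: the weights, the feature names and the example states are
assumptions made for the example and are not values used by the advisor. The second half
shows why the utility of dying cannot be minus infinity.

```python
import math

# Illustrative weights only: a large negative weight for death, positive for the rest.
WEIGHTS = {"death": -1000.0, "information": 5.0, "kill": 200.0, "arrows": 10.0}


def state_value(features, weights=WEIGHTS):
    """Linear weighting: multiply each feature by its weight and total them."""
    return sum(weights[name] * amount for name, amount in features.items())


# "death" and "kill" are probabilities, "information" an assumed gain score,
# "arrows" the number of arrows left.
cautious = {"death": 0.0, "information": 1.0, "kill": 0.0, "arrows": 3}
risky = {"death": 0.25, "information": 2.0, "kill": 0.0, "arrows": 3}
print(state_value(cautious), state_value(risky))

# With an infinite penalty every move carrying any risk at all scores the same,
# so a slight risk and a near-certain death can no longer be told apart.
infinite = dict(WEIGHTS, death=-math.inf)
slight = {"death": 0.01, "information": 2.0, "kill": 0.0, "arrows": 3}
grave = {"death": 0.9, "information": 2.0, "kill": 0.0, "arrows": 3}
print(state_value(slight, infinite) == state_value(grave, infinite))
```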

From the discussions of each of these components it is easily seen that Wumpus can
have many interesting variations and all of the variations will easily fit into the framework
of decision theory. A newer advisor based on such an approach would be able to advise a
user about playing all the different variations. So far our goal for the advisor has been to
introduce people to a situation in which the implications of a few logical rules are
important for sensible decision making. In particular we chose a situation which had
uncertain information. This naturally leads to the extension of teaching decision theory.
When we consider this we discover at least six types of bug a student may have which
directly concern decision theory, some of which were out of the scope of our current advisor.

1. Failure to judge probabilities.

Failure to determine the likelihoods of the various outcomes of an action will cause
errors when trying to back up the decision tree.

2. Inappropriate utility functions.

The student may have utility functions which are inappropriate for winning the
game. He may think that pits are less dangerous than the Wumpus, for example. Or he
may be playing the game according to a strategy which requires a different set of utility
functions. He may wish to fall into pits to help him remember the result of such an action
or to check his hypothesis about what will happen. He might also be more interested in
playing for fun than playing efficiently. An advisor that could recognise and relate to this
would need to take account of the player's values accordingly.

3. Failure to see all the alternatives.

Expressed in the decision theory paradigm this bug corresponds to an incomplete
procedure for generating a decision tree.







4. Refusal to cut losses.

This does not occur in Wumpus because there are no long-term plans involved. It is
however a common bug which manifests itself in a distorted set of values. Past losses are
weighted too heavily and actions are taken which have only a small probability of
annulling them.

5. Myopia.

A decision tree which is not deep enough will give rise to short-sightedness. Small
immediate gains will be preferred to long-term ones. Large long-term losses will not even
be considered.

6. Preoccupation with details.

This is related to the myopia bug but instead of the tree being too shallow it is
one-sided. All the planning resources are used to plan ahead on only a few paths. The
result is that when a move is eventually made it is either on the wrong track or based upon
too shallow an investigation.
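
The myopia bug in particular is easy to reproduce with a depth-limited back-up. The Python
sketch below is a toy illustration under assumed names and numbers: each node is a
dictionary with a static "value" and optional "actions", and the search falls back on the
static value when it runs out of depth.

```python
def back_up(node, depth):
    """Expected utility of a node, looking at most `depth` moves ahead."""
    if depth == 0 or not node.get("actions"):
        return node["value"]  # static evaluation at the search horizon
    return max(
        sum(p * back_up(child, depth - 1) for p, child in outcomes)
        for outcomes in node["actions"].values()
    )


def best_action(node, depth):
    """Action with the highest backed-up expected utility at this depth."""
    return max(
        node["actions"],
        key=lambda a: sum(p * back_up(c, depth - 1) for p, c in node["actions"][a]),
    )


# Assumed toy tree: "grab" looks good at once but leads to a large loss,
# while "wait" gains nothing at once and leads to a larger gain.
tree = {"value": 0, "actions": {
    "grab": [(1.0, {"value": 5, "actions": {"on": [(1.0, {"value": -50})]}})],
    "wait": [(1.0, {"value": 0, "actions": {"on": [(1.0, {"value": 20})]}})],
}}
print(best_action(tree, 1), best_action(tree, 2))
```

A student whose tree is one move deep chooses the small immediate gain and never sees the
later loss; one more level of depth reverses the choice.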

Wumpus has very simple strategies for play and though this was one reason for its
choice it is perhaps time to consider what additional properties we would like a game to
have for our advisor to teach in an interesting way. The simplicity of Wumpus largely
arises because all decision making for a move can be done at the time of the move with
only the information available at that time. Each move is made separately. Unlike chess,
the player does not need to make up strategies which govern the style of his play for a
sequence of moves. Nor are there ploys and trick methods which help lead an opponent
into an error. In short the Wumpus expert needs to do no planning ahead more than one
move. The basic cycle of play is to make inferences from current knowledge about the
current state of the board, pinpoint the dangers, choose a move to avoid these dangers,
make the move, thereby gain information and finally go to the beginning of the cycle.

A more advanced game would combine incomplete information with the need for
planning. Look-ahead would be necessary along sequences of actions each of which might
have an uncertain outcome. There should be different methods of play that are applicable
in different situations. Since evidence gathering is as important as evidence weighing, the
game situation should allow the player to design a set of methods or strategies for finding
information. Action in an uncertain situation is a feedback loop. Evidence is gathered and
weighed and plans are made both for acting and for gaining new information. The plans
may be based on hypotheses, and information gathering should be designed to test these
hypotheses as well as possible. One possible candidate for a game is the game "Clue". A
murder has been committed and each player tries to play the part of a detective and
discover three pieces of information: the weapon, the place, and the culprit. Each player
has certain information and by combining everyone's it would be clear what the answer was.
A player may only get a limited amount of information from another at any one time. He
thus has to make up strategies to determine the information he requests. Other players
hear every player's request but do not know the implications of the answer fully. Players
have to move around a board to particular locations before they can ask particular
questions so an extra cost is involved and other players may be able to infer things from
this behaviour.


Whatever game is chosen it will be necessary to combine planning with decision
theory. Feldman (1975) has shown how this can be done. The principle is easy to describe.
A decision tree is effectively a planning tree showing all the possible plans. The results of
actions in these plans are uncertain but provision is made for each possible outcome.
Instead of looking for the utility of a terminal state and moving so as to increase your
expectation of this value, all the steps of the plan have to be taken into account. Each step
has costs and gains associated with it and they must be added up to determine the value of
the plan as a whole. Then the plans can be compared and the best one taken. An
important feature of planning in an uncertain situation is that plans must be revised after
each step is executed since new information may change the situation.
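
The adding up of costs and gains over the steps of a plan can be sketched as follows. This
Python fragment is a simplified illustration and is not Feldman's method: it assumes a plan
is a fixed sequence of steps whose outcomes do not alter the later steps, and the plan
names, costs, probabilities and gains are invented for the example.

```python
def plan_value(steps):
    """Expected value of a plan given as a list of (cost, outcomes) steps,
    where outcomes is a list of (probability, gain) pairs for that step."""
    return sum(
        sum(p * gain for p, gain in outcomes) - cost
        for cost, outcomes in steps
    )


# Two assumed plans: move to a location first and then ask, or ask at once.
plans = {
    "walk then ask": [(1.0, [(1.0, 0.0)]), (2.0, [(0.5, 10.0), (0.5, 0.0)])],
    "ask at once": [(4.0, [(0.75, 12.0), (0.25, 0.0)])],
}
best = max(plans, key=lambda name: plan_value(plans[name]))
print(best, plan_value(plans[best]))
```

Revision after each step corresponds to calling plan_value again on the remaining steps of
every candidate plan, with probabilities updated by whatever has just been learned.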

Summing up, it seems that decision theory provides a rich framework for
improvements in the Wumpus advisor. In particular, the problems associated with making
complex decisions involving conflicts of goal, limited resources, and uncertain information
arise in a form which can be taught usefully by an advising program. These problems
confront people often in everyday life when they interact with others and when they try to
make plans for the future. Although an advising program written at this early stage will
not teach them how to cope with more than a toy situation, it is a step towards a deeper
understanding of teaching in this area.
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